Learning Representational Invariances for Data-Efficient Action Recognition
Authors
Abstract
Data augmentation is a ubiquitous technique for improving image classification when labeled data is scarce. Constraining the model predictions to be invariant to diverse augmentations effectively injects the desired representational invariances (e.g., invariance to photometric variations) and helps improve accuracy. Compared to image data, appearance variations in videos are far more complex due to the additional temporal dimension. Yet, data augmentation methods for videos remain under-explored. This paper investigates various data augmentation strategies that capture different video invariances, including photometric, geometric, temporal, and actor/scene augmentations. When integrated with existing semi-supervised learning frameworks, we show that our data augmentation strategy leads to promising performance on the Kinetics-100/400, Mini-Something-v2, UCF-101, and HMDB-51 datasets in the low-label regime. We also validate our data augmentation strategy in the fully supervised setting and demonstrate improved performance.
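The idea of constraining predictions to be invariant to augmentations can be illustrated with a minimal sketch. This is not the paper's implementation: the clip layout (a list of grayscale frames), the function names, the jitter ranges, the squared-difference consistency loss, and `toy_model` are all assumptions chosen for illustration, and actor/scene augmentation is omitted.

```python
import math
import random

# Assumed layout: a clip is a list of frames, a frame is a list of rows,
# and a pixel is a float in [0, 1].

def temporal_crop(clip, rng, num_frames=8):
    """Temporal augmentation: keep a random contiguous window of frames."""
    start = rng.randint(0, len(clip) - num_frames)  # both ends inclusive
    return clip[start:start + num_frames]

def horizontal_flip(clip, rng):
    """Geometric augmentation: flip every frame together with probability 0.5."""
    if rng.random() < 0.5:
        return [[row[::-1] for row in frame] for frame in clip]
    return clip

def brightness_contrast(clip, rng):
    """Photometric augmentation: one jitter setting shared by all frames."""
    contrast = rng.uniform(0.8, 1.2)
    brightness = rng.uniform(-0.1, 0.1)
    return [[[min(1.0, max(0.0, p * contrast + brightness)) for p in row]
             for row in frame] for frame in clip]

def augment(clip, rng):
    """Compose temporal, geometric, and photometric augmentations."""
    return brightness_contrast(horizontal_flip(temporal_crop(clip, rng), rng), rng)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_model(clip):
    """Hypothetical stand-in classifier: two logits from the mean intensity."""
    pixels = [p for frame in clip for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]

def consistency_loss(model, clip, rng):
    """Squared difference between predictions on two augmented views."""
    p1 = softmax(model(augment(clip, rng)))
    p2 = softmax(model(augment(clip, rng)))
    return sum((a - b) ** 2 for a, b in zip(p1, p2))

# Usage on a random 16-frame, 4x4 clip.
rng = random.Random(0)
clip = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(16)]
loss = consistency_loss(toy_model, clip, rng)
```

In a semi-supervised setup, a term of this kind would typically be computed on unlabeled clips and added to the supervised loss on labeled clips. Minimizing it pushes the model toward predictions that do not change under the chosen augmentations.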
Similar resources
Convex Learning with Invariances
Incorporating invariances into a learning algorithm is a common problem in machine learning. We provide a convex formulation which can deal with arbitrary loss functions and arbitrary losses. In addition, it is a drop-in replacement for most optimization algorithms for kernels, including solvers of the SVMStruct family. The advantage of our setting is that it relies on column generation instead...
Learning Generic Invariances in Object Recognition: Translation and Scale
Invariance to various transformations is key to object recognition but existing definitions of invariance are somewhat confusing while discussions of invariance are often confused. In this report, we provide an operational definition of invariance by formally defining perceptual tasks as classification problems. The definition should be appropriate for physiology, psychophysics and computationa...
Learning Invariances for Policy Generalization
While recent progress has spawned very powerful machine learning systems, those agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks. In this paper, we study a simple reinforcement learning problem and focus on learning policies that encode the proper invariances for generalization to different settings. We evaluate three potential methods...
Learning Spatio-Temporal Invariances
We present a neural network model for the unsupervised learning of high order visual invariances. The model is demonstrated on the problem of estimating sub-pixel stereo disparity from a temporal sequence of unprocessed image pairs. After learning on a given image sequence, the model's ability to detect sub-pixel disparity generalises, without additional learning, to image pairs from other sequ...
Data Mining for Action Recognition
In recent years, dense trajectories have been shown to be an efficient representation for action recognition and have achieved state-of-the-art results on a variety of increasingly difficult datasets. However, while the features have greatly improved the recognition scores, the training process and machine learning used haven't in general deviated from the object recognition based SVM approach. This i...
Journal
Journal title: Social Science Research Network
Year: 2022
ISSN: 1556-5068
DOI: https://doi.org/10.2139/ssrn.4035476